Introduction

This report presents a principal components analysis (PCA) of environmental and climatic variables for countries worldwide. PCA is an ordination method that simplifies the analysis of multivariate relationships by projecting the original multidimensional data onto a small number of principal components; the first two are shown in a two-dimensional visualization called a biplot. The data were compiled and provided by Zander Venter on Kaggle and were derived from publicly available remote sensing datasets uploaded to Google Earth Engine. Each variable is the mean value for a country, calculated at a reduction scale of about 10 km. The resulting biplot is used to explore correlations between variables.

Principal Components Analysis (PCA)

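# Attach the packages used in this report
library(tidyverse)  # read_csv(), select(), drop_na(), %>%
library(here)       # here()
library(janitor)    # clean_names()
library(ggfortify)  # autoplot() method for prcomp objects
library(plotly)     # ggplotly()

# Read in .csv file and clean up column names to lower snake case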
world_envi <- read_csv(here("data", "world_env_vars.csv")) %>% 
  clean_names()

# PCA data wrangling
world_pca <- world_envi %>% 
  select(accessibility_to_cities:cloudiness) %>%  # keep the numeric environmental variables
  select(-c("aspect", ends_with("_quart"))) %>%  # deselect aspect variable and variables ending in "_quart"
  drop_na() %>%  # drop observations with an NA value in any remaining variable
  scale() %>%  # center each variable and scale it to unit variance
  prcomp() # run the PCA; returns a prcomp object (sdev, rotation, x, ...)

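# Create a dataset that keeps all variables (including country) for biplot aesthetics.
# NAs are dropped on the same columns used for the PCA so that the rows stay aligned
# with the rows of world_pca.
world_complete <- world_envi %>% 
  drop_na(accessibility_to_cities:cloudiness, -aspect, -ends_with("_quart"))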

# See the loadings (the weight of each variable on each principal component)
world_pca$rotation
##                                 PC1          PC2          PC3         PC4
## accessibility_to_cities -0.01208814  0.080994313  0.155465417  0.63242853
## elevation                0.12170299  0.065453461 -0.615470039  0.16161011
## slope                    0.00864602  0.217016648 -0.481304296  0.19220913
## cropland_cover           0.13351093  0.155728850  0.226416990 -0.45184318
## tree_canopy_cover       -0.29269549  0.205747138 -0.093500133 -0.07559820
## isothermality           -0.32878964 -0.155887442 -0.085683170  0.12450410
## rain_driest_month       -0.21438961  0.286285974  0.085438103  0.02228879
## rain_mean_annual        -0.34576890  0.145629407 -0.093835621 -0.06027416
## rain_seasonailty         0.02788678 -0.387636099 -0.173088940  0.08217680
## rain_wettest_month      -0.32609753  0.030137008 -0.168669406 -0.08849432
## temp_annual_range        0.35312644 -0.009461463 -0.088583074 -0.10523824
## temp_diurnal_range       0.14399954 -0.364057154 -0.217151433  0.02517740
## temp_max_warmest_month  -0.06144520 -0.430660760  0.053850184 -0.12001497
## temp_mean_annual        -0.25427246 -0.342709756  0.071569639 -0.02314108
## temp_min_coldest_month  -0.32182415 -0.230103452  0.101955929  0.01952353
## temp_seasonality         0.34155662  0.115176304 -0.009944495 -0.11240990
## wind                     0.12845176  0.062683168  0.387239201  0.49558356
## cloudiness              -0.22391303  0.301909855 -0.006242412 -0.08724119
##                                 PC5         PC6          PC7         PC8
## accessibility_to_cities -0.20940502 -0.34696009 -0.116565237 -0.57668122
## elevation                0.15587800 -0.03043558 -0.288499379  0.05179766
## slope                    0.33730599  0.31351559  0.369138172 -0.29290821
## cropland_cover           0.46601010 -0.17997139 -0.067095308 -0.56643051
## tree_canopy_cover       -0.34281384  0.02064494  0.110487281 -0.03275171
## isothermality            0.05708578  0.02813448 -0.346186167  0.01322909
## rain_driest_month       -0.21655722  0.49611096 -0.100839537 -0.18052830
## rain_mean_annual        -0.14604622 -0.03146021  0.193989201 -0.07688583
## rain_seasonailty         0.09581165 -0.35189766  0.380832253  0.05562498
## rain_wettest_month      -0.08015437 -0.31903191  0.387506144 -0.06479596
## temp_annual_range       -0.36239474 -0.02715552  0.101732215 -0.10087448
## temp_diurnal_range      -0.18678901  0.01382341 -0.317599982 -0.18875569
## temp_max_warmest_month  -0.20075996  0.17467682  0.133857258 -0.24036604
## temp_mean_annual         0.04669893  0.11643201  0.065112460 -0.11648158
## temp_min_coldest_month   0.18458683  0.11859813 -0.009021261 -0.05049371
## temp_seasonality        -0.33353806  0.02727738  0.213553448 -0.04367819
## wind                     0.20266648  0.10162202  0.263856054  0.23502191
## cloudiness              -0.03464879 -0.45158633 -0.196687259  0.18502412
##                                 PC9         PC10        PC11         PC12
## accessibility_to_cities  0.19599603  0.057678326  0.06494913 -0.003068228
## elevation               -0.21334184  0.173273287  0.26576447 -0.531465086
## slope                    0.19783507  0.060187373 -0.21181922  0.207785643
## cropland_cover          -0.25896624 -0.117488809  0.11599561 -0.126686911
## tree_canopy_cover        0.11130361 -0.543974450  0.40102411 -0.288396509
## isothermality           -0.13693475 -0.114643300 -0.07425794  0.179703108
## rain_driest_month       -0.45716341  0.243048393  0.30215832  0.303345580
## rain_mean_annual        -0.22985034  0.007408206 -0.22960692 -0.080573616
## rain_seasonailty        -0.25191662  0.037103442  0.53951155  0.391377681
## rain_wettest_month      -0.23353667 -0.083398594 -0.32794185 -0.101457133
## temp_annual_range       -0.13243023  0.102490747 -0.09108661 -0.031914306
## temp_diurnal_range      -0.36519764 -0.313152210 -0.34265751  0.134036791
## temp_max_warmest_month   0.02320349  0.366077948 -0.03954127 -0.300903434
## temp_mean_annual         0.04542687  0.182436258  0.05894965 -0.213864719
## temp_min_coldest_month   0.12077963  0.118595950  0.05242488 -0.140144463
## temp_seasonality        -0.01247159  0.180868925  0.03952129 -0.073261090
## wind                    -0.47319066 -0.087968891 -0.11118473 -0.311500422
## cloudiness              -0.08207496  0.492271218 -0.10214937  0.059657443
##                                 PC13          PC14         PC15          PC16
## accessibility_to_cities -0.112190521  1.011526e-03 -0.011256001 -0.0059854686
## elevation               -0.186584597  2.718716e-02  0.012178805  0.0120868353
## slope                    0.342044595 -4.069031e-02  0.005919763 -0.0160242983
## cropland_cover           0.093224360  9.011536e-02 -0.020118348 -0.0005612962
## tree_canopy_cover        0.403522931 -9.842885e-02  0.024973810  0.0293942978
## isothermality            0.248884846  7.559277e-01  0.122218299  0.0036921553
## rain_driest_month       -0.150021482 -1.415074e-01  0.185432289  0.0068543102
## rain_mean_annual        -0.218574491  1.307717e-01 -0.774763729  0.0268019872
## rain_seasonailty         0.060963991 -1.094782e-05 -0.127390211  0.0408172471
## rain_wettest_month      -0.297371974  5.599980e-03  0.564905357 -0.0096530169
## temp_annual_range        0.133289305  1.361336e-01  0.062085348  0.1587513455
## temp_diurnal_range       0.194998403 -3.838778e-01 -0.081833124 -0.1330264098
## temp_max_warmest_month   0.214150417  3.662095e-02  0.035468859  0.4633864044
## temp_mean_annual         0.055354205 -9.702331e-02  0.019596671 -0.7292994627
## temp_min_coldest_month   0.009590249 -9.076295e-02 -0.031029878  0.1264635379
## temp_seasonality         0.103248188  3.692409e-01 -0.021755957 -0.4348084151
## wind                     0.257613126  6.870954e-03  0.003747155  0.0159750403
## cloudiness               0.507819087 -2.234689e-01 -0.009505227 -0.0367356578
##                                 PC17          PC18
## accessibility_to_cities -0.007662261  1.984271e-11
## elevation               -0.019293334  6.529019e-10
## slope                    0.018967302 -8.151991e-10
## cropland_cover           0.002778203 -4.336500e-10
## tree_canopy_cover        0.004019026 -5.220946e-10
## isothermality            0.051579364 -1.677881e-09
## rain_driest_month       -0.012115021  1.529479e-09
## rain_mean_annual         0.054631754 -1.298559e-09
## rain_seasonailty        -0.020244153 -4.458399e-10
## rain_wettest_month      -0.083061048  1.598033e-09
## temp_annual_range        0.513346442  5.809137e-01
## temp_diurnal_range      -0.225311648  1.763533e-09
## temp_max_warmest_month  -0.047692221 -3.934759e-01
## temp_mean_annual         0.386515085 -5.566786e-10
## temp_min_coldest_month  -0.444852032  7.125419e-01
## temp_seasonality        -0.567014187  2.681082e-09
## wind                     0.005847516 -2.975414e-10
## cloudiness              -0.026337262  2.354167e-10
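
The PC18 column deserves a second look: its loadings are essentially zero except on temp_annual_range, temp_max_warmest_month, and temp_min_coldest_month. That pattern is what one would expect if one of those variables is an exact linear combination of the other two (for example, an annual range computed as the warmest-month maximum minus the coldest-month minimum), although the source does not state how the variables were derived. The sketch below reproduces the effect on simulated data only; the column names `t_max`, `t_min`, `t_range`, and `other` are made up for illustration and are not part of the Kaggle dataset.

```r
# Simulated data (hypothetical columns, not the report's dataset)
set.seed(42)
n <- 200
t_max <- rnorm(n, mean = 30, sd = 5)
t_min <- rnorm(n, mean = 5, sd = 8)
sim <- data.frame(
  t_max   = t_max,
  t_min   = t_min,
  t_range = t_max - t_min,  # exact linear combination of the two columns above
  other   = rnorm(n)        # unrelated variable
)

# Same steps as in the report: scale, then run the PCA
sim_pca <- prcomp(scale(sim))

# Standard deviation of the last component (numerically zero here)
last_sdev <- sim_pca$sdev[ncol(sim)]

# Loadings of the last component: concentrated on the three dependent columns
last_loadings <- sim_pca$rotation[, ncol(sim)]

print(round(sim_pca$sdev, 6))
print(round(last_loadings, 6))
```

If the same is true of the report's data, running `world_pca$sdev` should show a final value very close to zero, and one of the three temperature variables could be dropped without losing information.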

PCA Biplot

# Create a PCA biplot using `autoplot()` function (from ggfortify package)
world_biplot <- autoplot(world_pca,
         data = world_complete,
         colour = 'country',
         loadings = TRUE,
         loadings.label = TRUE,  # label each loading arrow with its variable name
         loadings.colour = "gray50",
         loadings.label.colour = "black",
         loadings.label.vjust = -0.75) +
  theme_minimal() +
  theme(legend.position = "none")

# Make the graph interactive
ggplotly(world_biplot)

Figure 1. Biplot of the PCA performed on environmental and climatic variables (shown as labeled arrows). The length of each arrow indicates how much of that variable's variance lies along the PC1 and PC2 directions, with longer arrows indicating larger variance, and the angle between two arrows indicates the correlation between those variables. Each point on the interactive biplot represents a country included in the study (hover over a point to see the country name); the closer two points are, the more similar those countries are in multivariate space. Data: compiled and provided by Zander Venter on Kaggle and acquired through Google Earth Engine.
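
A PC1 and PC2 biplot is only as informative as the share of total variance those two components capture. As a minimal sketch, that share can be computed from the `sdev` element of a `prcomp` object. The example uses R's built-in `USArrests` data as a stand-in (an assumption made so the code is self-contained, since the report's CSV is not bundled here), and the names `demo_pca`, `prop_var`, and `biplot_share` are illustrative.

```r
# Stand-in data: the built-in USArrests data set, not the report's data
demo_pca <- prcomp(scale(USArrests))

# The variance of each component is its squared standard deviation
pc_var <- demo_pca$sdev^2

# Proportion of total variance captured by each component
prop_var <- pc_var / sum(pc_var)

# Share of total variance displayed by a PC1-PC2 biplot
biplot_share <- sum(prop_var[1:2])

print(round(prop_var, 3))
print(round(biplot_share, 3))
```

Substituting `world_pca` for `demo_pca` should give the corresponding figures for this report; `summary(world_pca)` reports the same proportions in tabular form.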

Biplot Summary

Variables that are highly positively correlated have an angle close to 0 degrees between their arrows, such as for:

Variables that are highly negatively correlated have an angle close to 180 degrees between their arrows, such as for:

Variables that are weakly correlated have an angle close to 90 degrees (or, equivalently, 270 degrees) between their arrows, such as for:

The closer two countries (points) are to each other on the biplot, the more similar they are across the environmental and climatic variables, to the extent that PC1 and PC2 capture that variation.
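
The angle rules above can be checked numerically from the PC1 and PC2 loadings printed earlier in this report. The sketch below is a rough check, assuming the biplot arrows are drawn from those loadings up to a common scale factor (which would leave angles unchanged). The five variables are an arbitrary subset chosen for illustration, not necessarily the pairs the summary had in mind, and the names `loadings_12`, `unit`, `cos_mat`, and `angles` are illustrative.

```r
# PC1 and PC2 loadings copied from the world_pca$rotation output in this report
loadings_12 <- rbind(
  rain_mean_annual       = c(-0.34576890,  0.145629407),
  temp_annual_range      = c( 0.35312644, -0.009461463),
  temp_seasonality       = c( 0.34155662,  0.115176304),
  isothermality          = c(-0.32878964, -0.155887442),
  temp_max_warmest_month = c(-0.06144520, -0.430660760)
)
colnames(loadings_12) <- c("PC1", "PC2")

# Rescale every arrow to unit length (row i is divided by its own length)
unit <- loadings_12 / sqrt(rowSums(loadings_12^2))

# Cosine of the angle between every pair of arrows
cos_mat <- unit %*% t(unit)

# Guard against rounding slightly outside [-1, 1] before taking acos()
cos_mat[cos_mat > 1] <- 1
cos_mat[cos_mat < -1] <- -1

# Pairwise angles in degrees
angles <- acos(cos_mat) * 180 / pi

print(round(angles, 1))
```

In this subset, temp_annual_range and temp_seasonality come out roughly 20 degrees apart, isothermality and temp_seasonality roughly 173 degrees apart, and temp_annual_range and temp_max_warmest_month roughly 97 degrees apart. The full set of angles can be obtained by replacing `loadings_12` with `world_pca$rotation[, 1:2]`.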